Search CORE

16 research outputs found

Spread of hate speech in online social media

Author: Dutt Ritam
Goyal Pawan
Mathew Binny
Mukherjee Animesh
Publication venue
Publication date: 04/12/2018
Field of study

The present online social media platform is afflicted with several issues, with hate speech being on the predominant forefront. The prevalence of online hate speech has fueled horrific real-world hate-crime such as the mass-genocide of Rohingya Muslims, communal violence in Colombo and the recent massacre in the Pittsburgh synagogue. Consequently, It is imperative to understand the diffusion of such hateful content in an online setting. We conduct the first study that analyses the flow and dynamics of posts generated by hateful and non-hateful users on Gab (gab.com) over a massive dataset of 341K users and 21M posts. Our observations confirms that hateful content diffuse farther, wider and faster and have a greater outreach than those of non-hateful users. A deeper inspection into the profiles and network of hateful and non-hateful users reveals that the former are more influential, popular and cohesive. Thus, our research explores the interesting facets of diffusion dynamics of hateful users and broadens our understanding of hate speech in the online world.Comment: 8 pages, 5 figures, and 4 tabl

arXiv.org e-Print Archive

Crossref

Linguistic representations for fewer-shot relation extraction across domains

Author: Dutt Ritam
Gururaja Sireesh
Liao Tinglong
Rose Carolyn
Publication venue
Publication date: 07/07/2023
Field of study

Recent work has demonstrated the positive impact of incorporating linguistic representations as additional context and scaffolding on the in-domain performance of several NLP tasks. We extend this work by exploring the impact of linguistic representations on cross-domain performance in a few-shot transfer setting. An important question is whether linguistic representations enhance generalizability by providing features that function as cross-domain pivots. We focus on the task of relation extraction on three datasets of procedural text in two domains, cooking and materials science. Our approach augments a popular transformer-based architecture by alternately incorporating syntactic and semantic graphs constructed by freely available off-the-shelf tools. We examine their utility for enhancing generalization, and investigate whether earlier findings, e.g. that semantic representations can be more helpful than syntactic ones, extend to relation extraction in multiple domains. We find that while the inclusion of these graphs results in significantly higher performance in few-shot transfer, both types of graph exhibit roughly equivalent utility.Comment: ACL 202

arXiv.org e-Print Archive

Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets

Author: Dutt Ritam
Joshi Rishabh
Newman-Griffis Denis
Rose Carolyn
Vashishth Shikhar
Publication venue
Publication date: 11/02/2021
Field of study

Medical entity linking is the task of identifying and standardizing medical concepts referred to in an unstructured text. Most of the existing methods adopt a three-step approach of (1) detecting mentions, (2) generating a list of candidate concepts, and finally (3) picking the best concept among them. In this paper, we probe into alleviating the problem of overgeneration of candidate concepts in the candidate generation module, the most under-studied component of medical entity linking. For this, we present MedType, a fully modular system that prunes out irrelevant candidate concepts based on the predicted semantic type of an entity mention. We incorporate MedType into five off-the-shelf toolkits for medical entity linking and demonstrate that it consistently improves entity linking performance across several benchmark datasets. To address the dearth of annotated training data for medical entity linking, we present WikiMed and PubMedDS, two large-scale medical entity linking datasets, and demonstrate that pre-training MedType on these datasets further improves entity linking performance. We make our source code and datasets publicly available for medical entity linking research.Comment: 35 page

arXiv.org e-Print Archive

PubMed Central

White Rose Research Online

NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities

Author: Dutt Ritam
Ghosh Kripabandhu
Ghosh Saptarshi
Hiware Kaustubh
Patro Sohan
Sinha Sayan
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2020
Field of study

Although a lot of research has been done on utilising Online Social Media during disasters, there exists no system for a specific task that is critical in a post-disaster scenario -- identifying resource-needs and resource-availabilities in the disaster-affected region, coupled with their subsequent matching. To this end, we present NARMADA, a semi-automated platform which leverages the crowd-sourced information from social media posts for assisting post-disaster relief coordination efforts. The system employs Natural Language Processing and Information Retrieval techniques for identifying resource-needs and resource-availabilities from microblogs, extracting resources from the posts, and also matching the needs to suitable availabilities. The system is thus capable of facilitating the judicious management of resources during post-disaster relief operations.Comment: ACL 2020 Workshop on Natural Language Processing for Social Media (SocialNLP

arXiv.org e-Print Archive

Crossref